Authorship attribution of SMS messages using an N-grams approach

Authors

  • Ashwin Mohan
  • Ibrahim M. Baggili
  • Marcus K. Rogers
  • Danielle Jones
Abstract

The pervasive use of SMS is increasing the amount of digital evidence available on cellular phones. Consequently, it has become important to identify the authors of SMS messages, a post-hoc analysis technique that is useful in criminal prosecution cases. This paper investigates an N-gram-based approach to determining the authorship of SMS messages. Despite the scarcity of words in SMS messages and the differences between SMS language and natural language, the chosen method shows encouraging results in identifying authors. The paper also examines the effects of gram size and similarity scoring technique on the prediction of SMS message authors.
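As a rough illustration of the general idea, the sketch below builds a character n-gram profile per author and attributes a new message to the most similar profile. It is not the authors' implementation: the abstract does not say which n-gram unit or similarity score was used, so character n-grams, lowercasing, cosine similarity, and all function names here are assumptions chosen for illustration.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text, n):
    """Count overlapping character n-grams (assumed unit; lowercased)."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def build_profile(messages, n):
    """Sum the n-gram counts of all known messages by one author."""
    profile = Counter()
    for message in messages:
        profile.update(char_ngrams(message, n))
    return profile


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors (assumed score)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def attribute(message, profiles, n):
    """Return the author whose profile is most similar to the message."""
    query = char_ngrams(message, n)
    return max(profiles, key=lambda author: cosine_similarity(query, profiles[author]))


# Hypothetical toy data: two authors with different texting styles.
profiles = {
    "alice": build_profile(["c u l8r m8", "gr8 c u 2moro"], n=2),
    "bob": build_profile(["See you later.", "Great, see you tomorrow."], n=2),
}
predicted = attribute("c u 2nite m8", profiles, n=2)
```

Varying `n` and swapping `cosine_similarity` for another score is one way to explore the two factors the abstract mentions, gram size and similarity scoring technique.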

Similar resources

A Forensic Authorship Classification in SMS Messages: A Likelihood Ratio Based Approach Using N-gram

Due to its convenience and low cost, short message service (SMS) has been a very popular medium for communication for quite some time. Unfortunately, however, SMS messages are sometimes used in illicit acts, such as communication between drug dealers and buyers, extortion, fraud, scams, hoaxes, false reports of terrorist threats, and many more. This study is a forensic study on the authorship clas...

Local n-grams for Author Identification: Notebook for PAN at CLEF 2013

Our approach to the author identification task uses existing authorship attribution methods using local n-grams (LNG) and performs a weighted ensemble. This approach came in third for this year’s competition, using a relatively simple scheme of weights by training set accuracy. LNG models create profiles, consisting of a list of character n-grams that best represent a particular author’s writin...

An Effective Model for SMS Spam Detection Using Content-based Features and Averaged Neural Network

In recent years, there has been considerable interest in using short message service (SMS) as one of the essential and straightforward communication services on mobile devices. The increased popularity of this service has also increased the number of attacks on mobile devices, such as SMS spam messages. SMS spam messages constitute a real problem to mobile subscribers; this worries telecomm...

N-gram-based Author Profiles for Authorship Attribution

We present a novel method for computer-assisted authorship attribution based on character-level n-gram author profiles, which is motivated by an almost-forgotten, pioneering method from 1976. The existing approaches to automated authorship attribution implicitly build author profiles as vectors of feature weights, as language models, or similar. Our approach is based on byte-level n-grams, it is l...

Authorship Attribution in Portuguese Using Character N-grams

For the Authorship Attribution (AA) task, character n-grams are considered among the best predictive features. In the English language, it has also been shown that some types of character n-grams perform better than others. This paper tackles the AA task in Portuguese by examining the performance of different types of character n-grams, and various combinations of them. The paper also experimen...

Publication date: 2010